Entry date: 20-01-2019
In my first entry, I plotted a scatter and line graph showing the relationship between carbon dioxide emission and economic growth from 1960 to 2014. This is shown below in Figure 1.
Figure 1:
Here, we observe a positive correlation between the two variables: as GDP increases, carbon dioxide emission increases too. However, I’d like to quantify the strength of this relationship. To do so, I carry out a Pearson product-moment correlation test.
Measuring the relationship between GDP and carbon dioxide emission
Figure 2:
##
## Pearson's product-moment correlation
##
## data: worldDF$GDP_current and worldDF$CO2_emissions_kT
## t = 20.681, df = 54, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9031098 0.9659048
## sample estimates:
## cor
## 0.9422853
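The output above has the shape that R's `cor.test()` prints. As a minimal sketch of such a call, the snippet below builds a made-up data frame that borrows the column names seen in the output (`GDP_current`, `CO2_emissions_kT`). The numbers are simulated for illustration and are not the World Bank figures, so the results will differ from Figure 2.

```r
# Made-up yearly data for illustration only; these are NOT the World Bank figures.
set.seed(1)
years <- 1960:2014
worldDF <- data.frame(
  year = years,
  GDP_current = seq(1e12, 8e13, length.out = length(years)) +
    rnorm(length(years), sd = 2e12),
  CO2_emissions_kT = seq(9e6, 3.6e7, length.out = length(years)) +
    rnorm(length(years), sd = 1e6)
)

# Pearson's product-moment correlation test (two-sided by default).
res <- cor.test(worldDF$GDP_current, worldDF$CO2_emissions_kT,
                method = "pearson")

res$estimate   # sample correlation coefficient r
res$conf.int   # 95 percent confidence interval for r
res$p.value    # p-value for H0: true correlation equals 0
res$parameter  # degrees of freedom, n - 2
```

Pulling the individual fields out of the result object, rather than reading them off the printed block, makes it easier to quote r and its confidence interval consistently in the text.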
Interpreting the results
Pearson’s correlation coefficient, denoted r, expresses the strength and direction of a linear relationship as a number between -1 and +1. Figure 2 shows a coefficient of +0.942. The positive sign corresponds to an upward slope on the graph, confirming a positive relationship between carbon dioxide emission and GDP: as GDP increases, carbon dioxide emission increases too. The value itself, 0.942, indicates a very strong correlation. The extreme values of -1 and +1 represent perfectly linear relationships, where every point falls exactly on a straight line and a change in one variable is matched by a proportional change in the other. As a rule of thumb, +0.8 and +0.6 reflect fairly strong and moderately strong positive relationships, respectively.
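To make the meaning of r concrete, here is a small sketch that computes it by hand as the covariance divided by the product of the standard deviations. The helper name `pearson_r` and the toy vectors are made up for illustration. It also shows that a perfectly linear relationship gives +1 or -1 whatever the slope of the line, so r measures how tightly the points follow a line, not how steep the line is.

```r
# Pearson's r is the covariance of x and y divided by the product of
# their standard deviations. (pearson_r is a hypothetical helper name.)
pearson_r <- function(x, y) {
  cov(x, y) / (sd(x) * sd(y))
}

x <- 1:10

# Perfectly linear relationships give +1 or -1 (up to floating-point
# rounding), regardless of slope or intercept.
r_steep   <- pearson_r(x, 3 * x + 7)      # steep upward line
r_shallow <- pearson_r(x, 0.01 * x)       # shallow upward line
r_down    <- pearson_r(x, -250 * x + 4)   # downward line
c(r_steep, r_shallow, r_down)

# The hand-rolled version agrees with R's built-in cor().
y <- c(2, 1, 4, 3, 7, 5, 8, 6, 10, 9)
all.equal(pearson_r(x, y), cor(x, y))
```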
From the results shown in Figure 2, we can also reject the null hypothesis that the true correlation between carbon dioxide emission and GDP is zero, as the p-value is far below 0.05 (p < 2.2e-16, that is, 2.2 × 10^-16). The observed relationship is statistically significant.
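As a sanity check, the test statistic and confidence interval in Figure 2 can be recovered from just the reported r and degrees of freedom. The sketch below uses the standard formulas for the Pearson test (a t statistic with n - 2 degrees of freedom, and a Fisher z-transformed interval); the variable names are my own.

```r
# Values copied from the Figure 2 output.
r  <- 0.9422853
df <- 54
n  <- df + 2          # the test uses n - 2 degrees of freedom

# t statistic for H0: true correlation = 0.
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via the Fisher z-transformation.
z     <- atanh(r)
se    <- 1 / sqrt(n - 3)
ci_95 <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

t_stat    # about 20.681, as reported
ci_95     # about 0.9031 and 0.9659, as reported
p_value < .Machine$double.eps   # TRUE: below the floor that R prints
```

Note that "< 2.2e-16" is not the p-value itself. It is the smallest value R prints by default (the machine epsilon), so the actual p-value is smaller still.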
Correlation does not imply causation
A strong correlation does not, by itself, imply that changes in one variable cause changes in the other. There may be other factors influencing the apparent relationship between carbon dioxide emission and GDP.
Typically, a randomized, controlled experiment is carried out to distinguish causal relationships from mere correlations. However, such experiments are simply infeasible, not to mention unethical, for testing whether or not economic growth causes carbon dioxide emission. Neither is a stand-alone event. Both are embedded within what I’d imagine to be a complicated framework, one that maps the causes and effects of many other observations on a global scale. For example, economic growth, commonly measured by GDP, is an umbrella term that reflects other developments, such as expanding construction and manufacturing industries, both of which have been at the forefront of global industrialization. These, in turn, fuel economic growth. Other factors, such as socioeconomic transitions, which can be further broken down into transitions in production patterns and in consumption patterns, drive industrialization too. The cycle continues both upstream and downstream.
A slight digression: studying global public health has become increasingly interdisciplinary. This is reflected in the emergence of many academic subject areas within the domain of global public health, such as studies of urbanization, population health, sustainability, and many more specialized fields, none of which can be scrutinized in isolation from the others.
What other factors contribute to carbon dioxide emission?
I view the determinants of carbon dioxide emission as operating on multiple levels, not unlike a socio-ecological model that defines a network of interactions between organisms and their environment. The levels are ordered from direct to indirect determinants. Fossil fuel combustion is a direct determinant of carbon dioxide emission, whereas I consider the activities that involve fossil fuel combustion to be indirect determinants. Economic growth is one idealized endpoint of those activities: fossil fuel combustion greatly facilitates the path to a wealthier world.
Whichever perspective one starts from, it is difficult to trace any cause-and-effect relationship, as the theories behind each observed trend rely on different schools of thought.
Let’s quickly explore some other variables that may yield similarly high r-values when correlated with carbon dioxide emission. These were again chosen from World Bank’s built-in R dataset, and they are in no way a complete list of the potential determinants of carbon dioxide emission. In fact, readers would not be wrong to see my selection as rather arbitrary. Nonetheless, I picked the following variables to reinforce my earlier point: correlation does not imply causation, because many intertwined factors may be at play.
Relationship between carbon dioxide emission and population size
The calculated r-values show a strong relationship between carbon dioxide emission and both total and urban global population size (r = 0.974 and r = 0.972, respectively; see Figures 3 and 4).
Figure 3:
##
## Pearson's product-moment correlation
##
## data: worldDF1$pop_size and worldDF1$CO2_emissions_kT
## t = 31.829, df = 55, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9559321 0.9846146
## sample estimates:
## cor
## 0.9739127
Figure 4:
##
## Pearson's product-moment correlation
##
## data: worldDF1$urban_pop and worldDF1$CO2_emissions_kT
## t = 30.571, df = 55, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9524195 0.9833688
## sample estimates:
## cor
## 0.9718127
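When several candidate variables are compared against the same outcome, the correlations can be computed in one pass instead of one test at a time. The sketch below is a hedged illustration: the data frame is simulated (not the World Bank figures), and only the column names `pop_size`, `urban_pop`, and `CO2_emissions_kT` are taken from the output above.

```r
# Made-up yearly data for illustration only; NOT the World Bank figures.
set.seed(42)
n <- 57
worldDF1 <- data.frame(
  pop_size         = seq(3.0e9, 7.3e9, length.out = n) + rnorm(n, sd = 5e7),
  urban_pop        = seq(1.0e9, 3.9e9, length.out = n) + rnorm(n, sd = 5e7),
  CO2_emissions_kT = seq(9.0e6, 3.6e7, length.out = n) + rnorm(n, sd = 1.5e6)
)

predictors <- c("pop_size", "urban_pop")

# Correlate each candidate variable with CO2 emissions in one pass.
r_values <- sapply(predictors, function(v) {
  cor(worldDF1[[v]], worldDF1$CO2_emissions_kT)
})
round(r_values, 3)
```

Any column that rises steadily over the period will score highly here, which is the point of the exercise.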
A mutual dependency on time
An important point to note is that the data are time-series data: a set of observations of a variable over a sequence of times. In our case, both carbon dioxide emission and GDP increase over time, across yearly intervals. In other words, we are merely seeing increases in emissions and GDP over time. This is known as a secular trend: a pattern that is not cyclical but persists over a long period. The stronger a variable’s trend over time, the more that trend dominates its correlation with any other trending variable. When two variables each show a strong trend over time, the correlation between them will unsurprisingly be strong: strongly positive if they trend in the same direction, strongly negative if they trend in opposite directions.
This tells me that the strong correlation between carbon dioxide emission and GDP may simply be spurious, meaning the apparent correlation is not meaningful in itself. Time acts as a common underlying variable, producing strong upward trends in both carbon dioxide emission and GDP. Hence, I’d have to control for such time-dependent trends before exploring the correlation between two time-series variables; the dependence within each series has to be taken into account. If not, the calculated r-value would lead me to misinterpret the relationship between carbon dioxide emission and GDP.
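To illustrate the problem, the sketch below simulates two series that share nothing except an upward trend over the same years, then compares the raw correlation with two common ways of controlling for the trend: correlating the residuals after removing a linear time trend, and correlating year-on-year changes. All data and names are made up, and a straight-line trend is an assumption that real GDP or emissions series may not satisfy.

```r
set.seed(2019)
year     <- 1960:2014
time_idx <- seq_along(year)

# Two series that each rise over time but are otherwise unrelated:
# the noise terms are drawn independently.
series_a <- 10 * time_idx + rnorm(length(time_idx), sd = 20)
series_b <-  5 * time_idx + rnorm(length(time_idx), sd = 10)

# Raw correlation is driven almost entirely by the shared upward trend.
r_raw <- cor(series_a, series_b)

# Option 1: remove a linear time trend and correlate the residuals.
resid_a <- resid(lm(series_a ~ time_idx))
resid_b <- resid(lm(series_b ~ time_idx))
r_detrended <- cor(resid_a, resid_b)

# Option 2: correlate year-on-year changes (first differences).
r_diff <- cor(diff(series_a), diff(series_b))

c(raw = r_raw, detrended = r_detrended, differenced = r_diff)
```

The raw correlation comes out very high even though the two series were generated independently, while the detrended and differenced correlations should be much closer to zero. Applying the same treatment to the emissions and GDP series would show how much of the r-value survives once the shared trend is removed.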